In this companion guide, we illustrate scenarios in which difference in differences analysis can be applied. We start with the simplest of scenarios and slowly amp up the complexity. For each scenario we describe a target parameter to estimate and the parameter estimated by a two-way fixed effects model. We also apply the Goodman-Bacon decomposition to this parameter to determine if the TWFE estimate is partially composed of estimates that are “forbidden” (e.g., ones that compare a newly treated state to a previously treated state) and apply the Callaway and Sant’Anna approach to estimation.
The goal is to illustrate scenarios where the usual TWFE method of estimation provides suitable results and scenarios where this approach is biased. In the latter case, we show alternative methods to estimation to overcome the issue.
First, load the packages for reading and plotting the data. The packages bacondecomp and did are specific to DID analyses.
# you will need to install these packages if you don't have them already
# install.packages(c("here", "readxl", "tidyverse", "patchwork", "magrittr",
# "broom", "ggrepel", "bacondecomp", "did"))
library(here)
library(readxl)
library(tidyverse)
library(patchwork)
library(magrittr)
library(broom)
library(ggrepel)
library(bacondecomp)
library(did)
We first consider the simplest case in which there is one never-treated group, one group that undergoes treatment. The groups are observed during two time periods only.
Below, the data are read in from an excel spreadsheet. Feel free to view the data in the spreadsheet or use an R View()er window. The data contain four variables state, time, policy, and outcome.
state: ID for the grouping variabletime: Time index, begins at 1policy: Indicator variable for exposure to the policy changeoutcome: The outcome variables1 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen1",
col_types = c("text", "text", "text", "numeric"))
# str(s1)
# View(s1)
Note that I specified the column types for each variable. I read in the state, time, and policy variables as “text” so that R will consider them as categorical/factor variables in the analysis (or, using econ-speak using “fixed effects”).
The following can be read from the labelled plot:
Assumptions
Under the canonical TWFE design, we can estimate the policy effect by including a fixed effect (FE) for state, a FE for time, and an indicator for the policy change. The effect estimate is the regression coefficient on the policy variable.
s1_mod <- lm(outcome ~ state + time + policy, data = s1)
tidy(s1_mod)
## # A tibble: 4 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9 NaN NaN NaN
## 2 state2 1.00 NaN NaN NaN
## 3 time2 3 NaN NaN NaN
## 4 policy1 2.00 NaN NaN NaN
The coefficient on the policy term is 2 showing that the regression model estimate equalled the causal effect.
Takeaway: When there are only two treatment groups and two time points, where one of the groups becomes treated, you can calculate the DID estimate by hand or using TWFE regression. Note that there are only 4 rows of data and no sampling variability so no measures of variation can be estimated in this setting.
This scenario is similar to Scenario 1, except now there are two states that undergo treatment. In Scenario 2A, the magnitude of the treatment effect is the same in both states. In Scenrio 2B, the magnitude of the treatment effect varies by state.
s2a <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen2a",
col_types = c("text", "text", "text", "numeric"))
s2b <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen2b",
col_types = c("text", "text", "text", "numeric"))
The following can be read from the labelled plot for Scenario 2A:
The following can be read from the labelled plot for Scenario 2B:
s2a_mod <- lm(outcome ~ policy + state + time, data = s2a)
tidy(s2a_mod)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 5 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9 0 Inf 0
## 2 policy1 2.00 0 Inf 0
## 3 state2 1.00 0 Inf 0
## 4 state3 2.00 0 Inf 0
## 5 time2 3 0 Inf 0
The coefficient estimate equals to 2 the ATT for Scenario 2a.
s2b_mod <- lm(outcome ~ policy + state + time, data = s2b)
tidy(s2b_mod)
## # A tibble: 5 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9 0.500 18.0 0.0353
## 2 policy1 2.5 0.866 2.89 0.212
## 3 state2 0.750 0.661 1.13 0.460
## 4 state3 2.25 0.661 3.40 0.182
## 5 time2 3 0.707 4.24 0.147
The coefficient estimate equals to 2.5 the ATT for Scenario 2b.
Takeaway: When there are multiple treatment groups and two time points, where >1 of the groups becomes treated, you can calculate the DID estimate by hand or using TWFE regression. If treatment effects are heterogeneous across states, then the estimated effect is the average treatment effect (ATE) across the states
Things to discuss with Dana:
In scenario 3, the number of time periods is increased to incorporate 5 time points before and after a policy change. One never-treated and one treated group are considered. In Scenario 3A, the effect of treatment is constant once treatment is introduced. In Scenario 3B, the effect of treatment increases with time (i.e., it is dynamic).
Because there are multiple time points, when we read in the data we update the column type for the time variable to be numeric. This will help with plotting the data. Later, when we run the TWFE model, we will use factor(time) to model time using indicator variables (i.e., time fixed effects) rather than including time as a linear term.
s3a <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen3a",
col_types = c("text", "numeric", "text", "numeric", "text"))
s3b <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen3b",
col_types = c("text", "numeric", "text", "numeric", "text"))
In addition to state, time, policy, and outcome, the data contains another variable time_since_policy which equals 0 before treatment (and always equals 0 for the never-treated), and counts the time since treatment in the treated state.
## `summarise()` has grouped output by 'state'. You can override using the `.groups` argument.
## `summarise()` has grouped output by 'state'. You can override using the `.groups` argument.
The following can be read from the labelled plot for Scenario 3A:
The following can be read from the labelled plot for Scenario 3B:
For Scenario 3B the effect is dynamic, and rather than calculating an ATT that aggregates over post-time, it might be beneficial to calculate the effect separately by time since treatment. To do that, we could label the outcomes separately by time since treatment and calculate a separate DID for every time point since treatment. The average of these DIDs will equal the 6 that we calculated using the average outcome over post-treatment time.
s3a_mod <- lm(outcome ~ policy + state + factor(time),
data = s3a)
tidy(s3a_mod)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 12 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 4.39e-15 2.05e15 3.62e-120
## 2 policy1 2.00 5.07e-15 3.94e14 1.92e-114
## 3 state2 2.00 3.59e-15 5.57e14 1.20e-115
## 4 factor(time)2 3.00 5.67e-15 5.29e14 1.83e-115
## 5 factor(time)3 6.00 5.67e-15 1.06e15 7.16e-118
## 6 factor(time)4 9 5.67e-15 1.59e15 2.79e-119
## 7 factor(time)5 12 5.67e-15 2.12e15 2.80e-120
## 8 factor(time)6 15 6.21e-15 2.41e15 9.73e-121
## 9 factor(time)7 18 6.21e-15 2.90e15 2.26e-121
## 10 factor(time)8 21 6.21e-15 3.38e15 6.59e-122
## 11 factor(time)9 24 6.21e-15 3.86e15 2.26e-122
## 12 factor(time)10 27 6.21e-15 4.34e15 8.83e-123
The coefficient of the policy term is 2 showing that the regression model estimate equaled the causal effect.
s3b_mod <- lm(outcome ~ policy + state + factor(time),
data = s3b)
tidy(s3b_mod)
## # A tibble: 12 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 1.22 7.35 0.0000801
## 2 policy1 6 1.41 4.24 0.00283
## 3 state2 2.00 1.00 2.00 0.0805
## 4 factor(time)2 3.00 1.58 1.90 0.0943
## 5 factor(time)3 6.00 1.58 3.79 0.00528
## 6 factor(time)4 9.00 1.58 5.69 0.000459
## 7 factor(time)5 12 1.58 7.59 0.0000637
## 8 factor(time)6 13.0 1.73 7.51 0.0000689
## 9 factor(time)7 17 1.73 9.81 0.00000976
## 10 factor(time)8 21 1.73 12.1 0.00000198
## 11 factor(time)9 25.0 1.73 14.4 0.000000519
## 12 factor(time)10 29.0 1.73 16.7 0.000000164
The coefficient of the policy term is 6 showing that the regression model estimate equaled the causal effect.
time_since_policy specificationRather than using a binary indicator for the policy change, it can be coded using the time_since_policy variable. If we include this as a factor variable in the model, then separate effects will be estimated for each time since treatment.
For Scenario 3A, using this variable should yield 2 for each time since treatment, since the treatment effect is constant over time.
s3a_mod2 <- lm(outcome ~ time_since_policy + state + factor(time),
data = s3a)
tidy(s3a_mod2)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 16 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 7.57e-15 1.19e15 3.00e-60
## 2 time_since_policy1 2.00 1.51e-14 1.32e14 1.97e-56
## 3 time_since_policy2 2.00 1.51e-14 1.32e14 1.97e-56
## 4 time_since_policy3 2.00 1.51e-14 1.32e14 1.97e-56
## 5 time_since_policy4 2.00 1.51e-14 1.32e14 1.97e-56
## 6 time_since_policy5 2.00 1.51e-14 1.32e14 1.97e-56
## 7 state2 2.00 6.18e-15 3.24e14 5.47e-58
## 8 factor(time)2 3.00 9.77e-15 3.07e14 6.76e-58
## 9 factor(time)3 6.00 9.77e-15 6.14e14 4.22e-59
## 10 factor(time)4 9 9.77e-15 9.21e14 8.34e-60
## 11 factor(time)5 12 9.77e-15 1.23e15 2.64e-60
## 12 factor(time)6 15.0 1.24e-14 1.21e15 2.77e-60
## 13 factor(time)7 18.0 1.24e-14 1.46e15 1.33e-60
## 14 factor(time)8 21.0 1.24e-14 1.70e15 7.20e-61
## 15 factor(time)9 24 1.24e-14 1.94e15 4.22e-61
## 16 factor(time)10 27.0 1.24e-14 2.18e15 2.64e-61
Indeed, this is what we see!
For Scenario 3B, the effect increases by 2 units for every unit of time since treatment is initiated so this should be reflected in the coefficient estimates.
s3b_mod2 <- lm(outcome ~ time_since_policy + state + factor(time),
data = s3b)
tidy(s3b_mod2)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 16 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 1.05e-14 8.56e14 1.12e-59
## 2 time_since_policy1 2.00 2.10e-14 9.51e13 7.33e-56
## 3 time_since_policy2 4.00 2.10e-14 1.90e14 4.58e-57
## 4 time_since_policy3 6.00 2.10e-14 2.85e14 9.05e-58
## 5 time_since_policy4 8.00 2.10e-14 3.80e14 2.86e-58
## 6 time_since_policy5 10.0 2.10e-14 4.76e14 1.17e-58
## 7 state2 2.00 8.58e-15 2.33e14 2.04e-57
## 8 factor(time)2 3.00 1.36e-14 2.21e14 2.51e-57
## 9 factor(time)3 6.00 1.36e-14 4.42e14 1.57e-58
## 10 factor(time)4 9.00 1.36e-14 6.63e14 3.10e-59
## 11 factor(time)5 12.0 1.36e-14 8.84e14 9.82e-60
## 12 factor(time)6 15.0 1.72e-14 8.74e14 1.03e-59
## 13 factor(time)7 18.0 1.72e-14 1.05e15 4.97e-60
## 14 factor(time)8 21.0 1.72e-14 1.22e15 2.68e-60
## 15 factor(time)9 24 1.72e-14 1.40e15 1.57e-60
## 16 factor(time)10 27.0 1.72e-14 1.57e15 9.81e-61
For this scenario, the effect estimate is 2 in the first time after treatment, then 4, and so on.
Note that we could have modeled time_since_policy linearly, however including time_since_policy as a factor variable allows the estimation of effects to be non-linear over time.
Scenario 4 extends Scenario 3 to the multiple group setting. State 1 is never-treated, while states 2-4 undergo treatment at time=6. The effects are not dynamic in time, but they are heterogeneous with some states having a larger treatment effect
s4 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen4",
col_types = c("text", "numeric", "text", "numeric", "text", "numeric"))
## `summarise()` has grouped output by 'state'. You can override using the `.groups` argument.
The following can be read from the labelled plot:
Pre-post difference for never-treated state 1: 30-15 = 15
Pre-post difference for treated state 2: 34-17 = 17
Pre-post difference for treated state 3: 35-19 = 16
Pre-post difference for treated state 4: 39-21 = 18
\(DID_{2 vs. 1} = 17-15 = 2\)
\(DID_{3 vs. 1} = 16-15 = 1\)
\(DID_{4 vs. 1} = 18-15 = 3\)
\(ATT = 2\)
s4_mod <- lm(outcome ~ policy + state + factor(time),
data = s4)
tidy(s4_mod)
## # A tibble: 14 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 0.277 32.4 1.45e-22
## 2 policy1 2.00 0.320 6.24 1.31e- 6
## 3 state2 2.00 0.253 7.90 2.24e- 8
## 4 state3 3.50 0.253 13.8 1.71e-13
## 5 state4 6.50 0.253 25.7 5.34e-20
## 6 factor(time)2 3.00 0.310 9.67 4.20e-10
## 7 factor(time)3 6.00 0.310 19.3 5.83e-17
## 8 factor(time)4 9 0.310 29.0 2.44e-21
## 9 factor(time)5 12 0.310 38.7 1.63e-24
## 10 factor(time)6 15.0 0.392 38.2 2.20e-24
## 11 factor(time)7 18.0 0.392 45.9 2.06e-26
## 12 factor(time)8 21.0 0.392 53.5 3.89e-28
## 13 factor(time)9 24.0 0.392 61.2 1.24e-29
## 14 factor(time)10 27.0 0.392 68.8 5.91e-31
The coefficient of the policy term is 2 showing that the regression model estimates the ATT.
One of the issues with TWFE estimation is that contrasts between earlier-treated and later-treated states contribute to the coefficient estimate of the policy term. While all states are treated at the same time in Scenario 4, we can use the Goodman-Bacon decomposition function bacon() to partition the effect estimate into weighted component parts.
Before using the bacon() function, we need to modify some of the variables that are stored as characters by R to be stored as numeric:
s4 %<>% mutate(state_n = as.numeric(as.character(state)),
policy_n = as.numeric(as.character(policy)))
The bacon function has an argument called formula, where you list the outcome as a function of the policy change, i.e., formula = outcome ~ policy_n. Note that no fixed effects are included in the formula. The time and state fixed effects are specified by the time_varand id_var arguments.
s4_bacon <- bacon(formula = outcome ~ policy_n,
data = s4,
id_var = "state_n",
time_var = "time")
## type weight avg_est
## 1 Treated vs Untreated 1 2
This small table illustrates that 100% of the weight is put on comparisons between treated and untreated states and the ATT equals 2.
For more detail we can print the object:
s4_bacon
## treated untreated estimate weight type
## 2 6 99999 2 1 Treated vs Untreated
treated=6 says that all treated states start treatment at time = 6 and are compared to untreated states (with a fictitious treatment time of 99999).
The results from the Goodman-Bacon Decomposition are reassuring and tell us we can trust the TWFE estimate. For pedagodgical purposes, we also show the estimate if we applied the Callaway Sant’Anna estimator using the att_gt function to estimate group-time ATEs.
Like bacon(), att_gt() requires all numeric variables. It also requires an argument called gname. gname expects a variable which encodes for each state the time of first policy change. For never-treated state 1, this variable equals 0 and for treated states 2-4 this variable equals 6.
s4_cs <- att_gt(yname = "outcome",
tname = "time",
idname = "state_n",
gname = "time_first_treat",
data = s4,
anticipation = 0)
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
## Check groups: 0,6.
## Warning in att_gt(yname = "outcome", tname = "time", idname = "state_n", : Not
## returning pre-test Wald statistic due to singular covariance matrix
m2_ag <- aggte(s4_cs, type="simple")
summary(m2_ag)
##
## Call:
## aggte(MP = s4_cs, type = "simple")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 2 0 2 2 *
##
##
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Here, we see the ATT is equal to as estimated by the TWFE model.
Scenario 5 expands upon Scenario 4 by introducing staggered timing. In this example, the treated states introduce the policy at two different time points. The treatment effects are non-dynamic and there are two never-treated states and two ever-treated states.
s5 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen5",
col_types = c("text", "numeric", "text", "numeric", "text", "numeric"))
Visualization of all comparisons made by TWFE regression
Contrast 1:
Contrast 2:
Contrast 3:
Contrast 4:
s5_mod <- lm(outcome ~ policy + state + factor(time), data = s5)
tidy(s5_mod)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 24 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 9.00 1.65e-14 5.46e14 0
## 2 policy1 3.00 1.31e-14 2.30e14 0
## 3 state2 1.00 9.43e-15 1.06e14 0
## 4 state3 3.00 1.41e-14 2.13e14 0
## 5 state4 9.00 1.11e-14 8.10e14 0
## 6 factor(time)2 3.00 2.11e-14 1.42e14 0
## 7 factor(time)3 6.00 2.11e-14 2.85e14 0
## 8 factor(time)4 9.00 2.11e-14 4.27e14 0
## 9 factor(time)5 12 2.13e-14 5.63e14 0
## 10 factor(time)6 15.0 2.13e-14 7.03e14 0
## # … with 14 more rows
The coefficient of the policy term is 3 showing that the regression model estimates the ATT.
Using the bacon()function, we can see see the four comparisons being made and the effect estimates for each comparison. The weight column in the output shows how much weight each estimate contributes to the ATT. Here, the treated vs untreated comparisons are the most heavily weighted.
s5 %<>% mutate(state_n = as.numeric(as.character(state)),
policy_n = as.numeric(as.character(policy)))
#bacon decomp says no problems
#only comparing treated vs untreated and the average effect estimate is 2
s5_bacon <- bacon(outcome ~ policy_n,
data = s5,
id_var = "state_n",
time_var = "time")
## type weight avg_est
## 1 Earlier vs Later Treated 0.06715 3
## 2 Later vs Earlier Treated 0.15108 3
## 3 Treated vs Untreated 0.78177 3
We can also see a list of each comparison and it’s weight:
s5_bacon
## treated untreated estimate weight type
## 2 5 99999 3 0.30695444 Treated vs Untreated
## 3 12 99999 3 0.47482014 Treated vs Untreated
## 6 12 5 3 0.15107914 Later vs Earlier Treated
## 8 5 12 3 0.06714628 Earlier vs Later Treated
For pedagogical purposes, we also show the estimate if we applied the Callaway Sant’Anna estimator using the att_gt() function to estimate group-time ATTs.
s5_cs <- att_gt(yname = "outcome",
tname = "time",
idname = "state_n",
gname = "time_first_treat",
data = s5,
anticipation = 0)
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
## Check groups: 0,5,12.
## Warning in att_gt(yname = "outcome", tname = "time", idname = "state_n", : Not
## returning pre-test Wald statistic due to singular covariance matrix
s5_cs_ag <- aggte(s5_cs, type="simple")
summary(s5_cs_ag)
##
## Call:
## aggte(MP = s5_cs, type = "simple")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 3 0 3 3 *
##
##
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Here, we see the ATT is equal to 3 as estimated by the TWFE model.
Scenario 6 is similar to Scenario 5 except now the treatment effect increases as time goes on, i.e., it is dynamic. The magnitude of the treatment effect is the same for both treated groups.
s6 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen6",
col_types = c("text", "numeric", "text", "numeric", "text", "numeric"))
Contrast 1:
Contrast 2:
Contrast 3:
Contrast 4:
Contrasts 1 and 2 are okay because they are clean comparisons between never-treated and treated states. Contrasts 3 and 4 are trickier. Contrast 3 compares the earlier treated state to the later treated state. It is okay because parallel trends holds and it only uses the time before the later state is treated to inform the estimation. Contrast 4 is the problem – it uses post-treatment time from state 3 as the “pre” time for the comparison. The issue is that this post time includes the effects of the treatment, and since the effect is dynamic it does not satisfy the parallel trends assumption. However this contrast still contributes to the TWFE regression estimate!
Let’s take a look…
s6_mod <- lm(outcome ~ policy + state + factor(time),
data = s6)
tidy(s6_mod)
## # A tibble: 24 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 8.24 1.28 6.44 2.90e- 8
## 2 policy1 6.80 1.01 6.72 1.02e- 8
## 3 state2 1.00 0.731 1.37 1.77e- 1
## 4 state3 5.96 1.09 5.46 1.11e- 6
## 5 state4 9.09 0.861 10.6 6.23e-15
## 6 factor(time)2 3.00 1.63 1.84 7.17e- 2
## 7 factor(time)3 6.00 1.63 3.67 5.41e- 4
## 8 factor(time)4 9 1.63 5.51 9.51e- 7
## 9 factor(time)5 11.1 1.65 6.68 1.16e- 8
## 10 factor(time)6 14.3 1.65 8.65 6.76e-12
## # … with 14 more rows
The coefficient of the policy term equals 6.8. We can now use the Goodman Bacon decomposition to see how much weight each of the contrasts contributes to the effect estimate:
s6 %<>% mutate(state_n = as.numeric(as.character(state)),
policy_n = as.numeric(as.character(policy)))
s6_bacon <- bacon(outcome ~ policy_n,
data = s6,
id_var = "state_n",
time_var = "time")
## type weight avg_est
## 1 Earlier vs Later Treated 0.06715 6.00000
## 2 Later vs Earlier Treated 0.15108 -1.00000
## 3 Treated vs Untreated 0.78177 8.37423
It’s more helpful to view the four contrasts separately:
s6_bacon
## treated untreated estimate weight type
## 2 5 99999 10.5 0.30695444 Treated vs Untreated
## 3 12 99999 7.0 0.47482014 Treated vs Untreated
## 6 12 5 -1.0 0.15107914 Later vs Earlier Treated
## 8 5 12 6.0 0.06714628 Earlier vs Later Treated
So we can see the estimates of the TE for each contrast are as we calculated them earlier.
We can also confirm that you get the TWFE estimate by taking the weighted average of the Goodman-Bacon decomposition components:
#see the TWFE estimate
sum(s6_bacon$estimate*s6_bacon$weight)
## [1] 6.798561
Okay, so we know the TWFE regression estimate is wrong, in particular because it is incorporating a contrast that we wouldn’t want to make in practice between an earlier treated and a later treated group. To overcome this we can use the Callaway and Sant’Anna estimator.
First, we running the att_gt() function and display its output using summary(). The ATT is estimated for each combination of treated state and time. It also includes estimates for pre-policy time, which helps with the evaluation of the parallel trends assumption or to see if there are any lead effects (“anticipation”) of the treatment on the outcome.
s6_cs <- att_gt(yname = "outcome",
tname = "time",
idname = "state_n",
gname = "time_first_treat",
data = s6,
anticipation = 0)
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
## Check groups: 0,5,12.
## Warning in att_gt(yname = "outcome", tname = "time", idname = "state_n", : Not
## returning pre-test Wald statistic due to singular covariance matrix
summary(s6_cs)
##
## Call:
## att_gt(yname = "outcome", tname = "time", idname = "state_n",
## gname = "time_first_treat", data = s6, anticipation = 0)
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
## Group-Time Average Treatment Effects:
## Group Time ATT(g,t) Std. Error [95% Simult. Conf. Band]
## 5 2 0 0 NA NA NA
## 5 3 0 0 NA NA NA
## 5 4 0 0 NA NA NA
## 5 5 3 0 NA NA NA
## 5 6 4 0 NA NA NA
## 5 7 5 0 NA NA NA
## 5 8 6 0 NA NA NA
## 5 9 7 0 NA NA NA
## 5 10 8 0 NA NA NA
## 5 11 9 0 NA NA NA
## 5 12 10 0 NA NA NA
## 5 13 11 0 NA NA NA
## 5 14 12 0 NA NA NA
## 5 15 13 0 NA NA NA
## 5 16 14 0 NA NA NA
## 5 17 15 0 NA NA NA
## 5 18 16 0 NA NA NA
## 5 19 17 0 NA NA NA
## 5 20 18 0 NA NA NA
## 12 2 0 0 NA NA NA
## 12 3 0 0 NA NA NA
## 12 4 0 0 NA NA NA
## 12 5 0 0 NA NA NA
## 12 6 0 0 NA NA NA
## 12 7 0 0 NA NA NA
## 12 8 0 0 NA NA NA
## 12 9 0 0 NA NA NA
## 12 10 0 0 NA NA NA
## 12 11 0 0 NA NA NA
## 12 12 3 0 NA NA NA
## 12 13 4 0 NA NA NA
## 12 14 5 0 NA NA NA
## 12 15 6 0 NA NA NA
## 12 16 7 0 NA NA NA
## 12 17 8 0 NA NA NA
## 12 18 9 0 NA NA NA
## 12 19 10 0 NA NA NA
## 12 20 11 0 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Callaway and Sant’Anna provide many options for aggregating the group-time effects. The simplest option is specify type = simple. This estimate considered only the clean contrasts of 1 and 2 and combines them by weighting by the total time in the post-treatment period. For contrast 1, there are 16 time periods post treatment and for contrast 2, there are 9 periods post-treatment. Thus this estimate is the weighted average: \(10.5\times(16/25) + 7\times(9/25)=9.24\)
#aggregate the group time average treatment effects
#type = "simple": weighted average of all group-time average treatment effects
#with weights proportional to the group size
s6_cs_ag <- aggte(s6_cs, type="simple")
summary(s6_cs_ag)
##
## Call:
## aggte(MP = s6_cs, type = "simple")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 9.24 2.3911 4.5535 13.9265 *
##
##
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Alternatively, we can specify type = "group" to get separate effect estimates according to time of implementation. Note that group denotes the time of the policy change for the treated states.
s6_cs_ag2 <- aggte(s6_cs, type = "group")
summary(s6_cs_ag2)
##
## Call:
## aggte(MP = s6_cs, type = "group")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 8.75 0 8.75 8.75 *
##
##
## Group Effects:
## group ATT Std. Error [95% Simult. Conf. Band]
## 5 10.5 0 10.5 10.5 *
## 12 7.0 0 7.0 7.0 *
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Alternatively, you can estimate the dynamic effect separately for each time since the treatment has been introduced. Remember that this effect estimation is also done for pre-treatment time, so don’t be alarmed by all the rows in this table!
s6_cs_ag3 <- aggte(s6_cs, type="dynamic")
summary(s6_cs_ag3)
##
## Call:
## aggte(MP = s6_cs, type = "dynamic")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 10.5 0 10.5 10.5 *
##
##
## Dynamic Effects:
## event time ATT Std. Error [95% Simult. Conf. Band]
## -10 0 0 NA NA NA
## -9 0 0 NA NA NA
## -8 0 0 NA NA NA
## -7 0 0 NA NA NA
## -6 0 0 NA NA NA
## -5 0 0 NA NA NA
## -4 0 0 NA NA NA
## -3 0 0 NA NA NA
## -2 0 0 NA NA NA
## -1 0 0 NA NA NA
## 0 3 0 NA NA NA
## 1 4 0 NA NA NA
## 2 5 0 NA NA NA
## 3 6 0 NA NA NA
## 4 7 0 NA NA NA
## 5 8 0 NA NA NA
## 6 9 0 NA NA NA
## 7 10 0 NA NA NA
## 8 11 0 NA NA NA
## 9 12 0 NA NA NA
## 10 13 0 NA NA NA
## 11 14 0 NA NA NA
## 12 15 0 NA NA NA
## 13 16 0 NA NA NA
## 14 17 0 NA NA NA
## 15 18 0 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
In this case, it is very helpful to plot the effect estimates over time:
ggdid(s6_cs_ag3)
For exposition’s sake, we also show the estimate using type = "calendar". For this specific example, I don’t love this aggregation – I think for dynamic effects it makes things a little wonky and doesn’t reveal anything super helpful:
#calendar time specific
#i don't love this one...but CS say they prefer it here
#https://bcallaway11.github.io/did/articles/did-basics.html
s6_cs_ag4 <- aggte(s6_cs, type = "calendar")
summary(s6_cs_ag4)
##
## Call:
## aggte(MP = s6_cs, type = "calendar")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 8.5312 1.4594 5.6708 11.3917 *
##
##
## Time Effects:
## time ATT Std. Error [95% Simult. Conf. Band]
## 5 3.0 0.0000 NA NA NA
## 6 4.0 0.0000 NA NA NA
## 7 5.0 0.0000 NA NA NA
## 8 6.0 0.0000 NA NA NA
## 9 7.0 0.0000 NA NA NA
## 10 8.0 0.0000 NA NA NA
## 11 9.0 0.0000 NA NA NA
## 12 6.5 2.5946 NA NA NA
## 13 7.5 2.5946 NA NA NA
## 14 8.5 5.1891 NA NA NA
## 15 9.5 2.5946 NA NA NA
## 16 10.5 2.5946 NA NA NA
## 17 11.5 2.5946 NA NA NA
## 18 12.5 5.1891 NA NA NA
## 19 13.5 2.5946 NA NA NA
## 20 14.5 5.1891 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
ggdid(s6_cs_ag4)
time_since_policy specificationBut what if we model TWFE with the time_since_change indicators?
s6_mod2 <- lm(outcome ~ time_since_policy + state + factor(time),
data = s6)
tidy(s6_mod2) %>% filter(str_detect(term, "time_since")) %>%
mutate(time = as.numeric(gsub("time_since_policy", "", term))) %>%
arrange(time)
## Warning in summary.lm(x): essentially perfect fit: summary may be unreliable
## # A tibble: 16 × 6
## term estimate std.error statistic p.value time
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 time_since_policy1 3.00 2.72e-14 1.10e14 0 1
## 2 time_since_policy2 4.00 2.72e-14 1.47e14 0 2
## 3 time_since_policy3 5.00 2.77e-14 1.80e14 0 3
## 4 time_since_policy4 6.00 2.77e-14 2.16e14 0 4
## 5 time_since_policy5 7.00 2.77e-14 2.52e14 0 5
## 6 time_since_policy6 8.00 2.77e-14 2.88e14 0 6
## 7 time_since_policy7 9.00 2.77e-14 3.24e14 0 7
## 8 time_since_policy8 10.0 2.87e-14 3.48e14 0 8
## 9 time_since_policy9 11.0 2.87e-14 3.83e14 0 9
## 10 time_since_policy10 12.0 4.00e-14 3.00e14 0 10
## 11 time_since_policy11 13.0 4.00e-14 3.25e14 0 11
## 12 time_since_policy12 14.0 4.00e-14 3.50e14 0 12
## 13 time_since_policy13 15.0 4.00e-14 3.75e14 0 13
## 14 time_since_policy14 16.0 4.00e-14 4.00e14 0 14
## 15 time_since_policy15 17.0 4.03e-14 4.22e14 0 15
## 16 time_since_policy16 18.0 4.03e-14 4.47e14 0 16
In this setting, the regression model still works! The estimates from the policy indicator variables equal the output from the Callaway Sant’Anna when specified using type = "dynamic"
Takeway: When the treatment effect is staggered and dynamic, you can still capture the effect estimate using a TWFE model so long as the policy effect is modeled using the time since treatment variable.
Scenario 7 is like Scenario 6 except now the treatment effect is dynamic AND heterogeneous. In this scenario, the state that implements the policy change first has a stronger treatment effect (i.e., a steeper change in slope)
s7 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen7",
col_types = c("text", "numeric", "text", "numeric", "text", "numeric"))
Contrast 1:
Contrast 2:
Contrast 3:
Contrast 4:
Similar to the previous scenario, parallel trends is satisfied for contrasts 1-3, so we can be okay with these contrasts contributed to the average effect estimate. Like last time, contrast 4 is the problem, and here the problem is intensified because the “control” state – which was treated in an earlier period – is undergoing a large treatment effect in the “pre” period (times 5-11). This makes the pre-post different in the control state larger than the pre-post difference in the comparison state – because the control state is still experiencing a stronger (i.e., steeper slope) treatment effect than the state that is treated at time 12. This makes the DID for this group and time look negative – like the treatment is associated with a decrease in the outcome!
Let’s take a look at the regression output:
s7_mod <- lm(outcome ~ policy + state + factor(time),
data = s7)
tidy(s7_mod)
## # A tibble: 24 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 6.98 2.76 2.53 0.0143
## 2 policy1 6.84 2.18 3.13 0.00276
## 3 state2 1.00 1.58 0.634 0.529
## 4 state3 11.9 2.35 5.07 0.00000472
## 5 state4 8.17 1.86 4.40 0.0000496
## 6 factor(time)2 3.00 3.53 0.851 0.399
## 7 factor(time)3 6.00 3.53 1.70 0.0944
## 8 factor(time)4 9.00 3.53 2.55 0.0135
## 9 factor(time)5 11.0 3.57 3.09 0.00308
## 10 factor(time)6 14.5 3.57 4.07 0.000147
## # … with 14 more rows
The coefficient of the policy term equals 6.84. We can now use the Goodman Bacon decomposition to see how much weight each of the contrasts contributes to the effect estimate:
s7 %<>% mutate(state_n = as.numeric(as.character(state)),
policy_n = as.numeric(as.character(policy)))
s7_bacon <- bacon(outcome ~ policy_n,
data = s7,
id_var = "state_n",
time_var = "time")
## type weight avg_est
## 1 Earlier vs Later Treated 0.06715 9.00000
## 2 Later vs Earlier Treated 0.15108 -11.00000
## 3 Treated vs Untreated 0.78177 10.10429
s7_bacon
## treated untreated estimate weight type
## 2 5 99999 18 0.30695444 Treated vs Untreated
## 3 12 99999 5 0.47482014 Treated vs Untreated
## 6 12 5 -11 0.15107914 Later vs Earlier Treated
## 8 5 12 9 0.06714628 Earlier vs Later Treated
#see the TWFE estimate
sum(s7_bacon$estimate*s7_bacon$weight)
## [1] 6.841727
Now that we know how to use the Callaway and Sant’Anna code, we can apply it to this example as well. Let’s start by running the att_gt() function, remembering that the output is for every combination of group and time:
s7_cs <- att_gt(yname = "outcome",
tname = "time",
idname = "state_n",
gname = "time_first_treat",
data = s7,
anticipation = 0)
## Warning in pre_process_did(yname = yname, tname = tname, idname = idname, : Be aware that there are some small groups in your dataset.
## Check groups: 0,5,12.
## Warning in att_gt(yname = "outcome", tname = "time", idname = "state_n", : Not
## returning pre-test Wald statistic due to singular covariance matrix
summary(s7_cs)
##
## Call:
## att_gt(yname = "outcome", tname = "time", idname = "state_n",
## gname = "time_first_treat", data = s7, anticipation = 0)
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
## Group-Time Average Treatment Effects:
## Group Time ATT(g,t) Std. Error [95% Simult. Conf. Band]
## 5 2 0 0 NA NA NA
## 5 3 0 0 NA NA NA
## 5 4 0 0 NA NA NA
## 5 5 3 0 NA NA NA
## 5 6 5 0 NA NA NA
## 5 7 7 0 NA NA NA
## 5 8 9 0 NA NA NA
## 5 9 11 0 NA NA NA
## 5 10 13 0 NA NA NA
## 5 11 15 0 NA NA NA
## 5 12 17 0 NA NA NA
## 5 13 19 0 NA NA NA
## 5 14 21 0 NA NA NA
## 5 15 23 0 NA NA NA
## 5 16 25 0 NA NA NA
## 5 17 27 0 NA NA NA
## 5 18 29 0 NA NA NA
## 5 19 31 0 NA NA NA
## 5 20 33 0 NA NA NA
## 12 2 0 0 NA NA NA
## 12 3 0 0 NA NA NA
## 12 4 0 0 NA NA NA
## 12 5 0 0 NA NA NA
## 12 6 0 0 NA NA NA
## 12 7 0 0 NA NA NA
## 12 8 0 0 NA NA NA
## 12 9 0 0 NA NA NA
## 12 10 0 0 NA NA NA
## 12 11 0 0 NA NA NA
## 12 12 1 0 NA NA NA
## 12 13 2 0 NA NA NA
## 12 14 3 0 NA NA NA
## 12 15 4 0 NA NA NA
## 12 16 5 0 NA NA NA
## 12 17 6 0 NA NA NA
## 12 18 7 0 NA NA NA
## 12 19 8 0 NA NA NA
## 12 20 9 0 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Let’s aggregate these ATTs using type = simple. This estimate considered only the clean contrasts of 1 and 2 and combines them by weighting by the total time in the post-treatment period. For contrast 1, there are 16 time periods post treatment and for contrast 2, there are 9 periods post-treatment. Thus this estimate is the weighted average: \(18\times(16/25) + 5\times(9/25)=13.32\)
#aggregate the group time average treatment effects
#type = "simple": weighted average of all group-time average treatment effects
#with weights proportional to the group size
s7_cs_ag <- aggte(s7_cs, type="simple")
summary(s7_cs_ag)
##
## Call:
## aggte(MP = s7_cs, type = "simple")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 13.32 4.4407 4.6164 22.0236 *
##
##
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Alternatively, we can specify type = "group" to get separate effect estimates according to time of implementation and see that these match our by-hand calculations:
s7_cs_ag2 <- aggte(s7_cs, type = "group")
summary(s7_cs_ag2)
##
## Call:
## aggte(MP = s7_cs, type = "group")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 11.5 0 11.5 11.5 *
##
##
## Group Effects:
## group ATT Std. Error [95% Simult. Conf. Band]
## 5 18 0 18 18 *
## 12 5 0 5 5 *
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
Alternatively, you can estimate the dynamic effect separately for each time since the treatment has been introduced. Remember that this effect estimation is also done for pre-treatment time, so don’t be alarmed by all the rows in this table!
s7_cs_ag3 <- aggte(s7_cs, type="dynamic")
summary(s7_cs_ag3)
##
## Call:
## aggte(MP = s7_cs, type = "dynamic")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 16.3125 1.2509 13.8607 18.7643 *
##
##
## Dynamic Effects:
## event time ATT Std. Error [95% Simult. Conf. Band]
## -10 0.0 0.0000 NA NA NA
## -9 0.0 0.0000 NA NA NA
## -8 0.0 0.0000 NA NA NA
## -7 0.0 0.0000 NA NA NA
## -6 0.0 0.0000 NA NA NA
## -5 0.0 0.0000 NA NA NA
## -4 0.0 0.0000 NA NA NA
## -3 0.0 0.0000 NA NA NA
## -2 0.0 0.0000 NA NA NA
## -1 0.0 0.0000 NA NA NA
## 0 2.0 0.7413 NA NA NA
## 1 3.5 0.0000 NA NA NA
## 2 5.0 2.9652 NA NA NA
## 3 6.5 1.8533 NA NA NA
## 4 8.0 4.4478 NA NA NA
## 5 9.5 0.0000 NA NA NA
## 6 11.0 5.9304 NA NA NA
## 7 12.5 3.3359 NA NA NA
## 8 14.0 3.7065 NA NA NA
## 9 21.0 0.0000 NA NA NA
## 10 23.0 0.0000 NA NA NA
## 11 25.0 0.0000 NA NA NA
## 12 27.0 0.0000 NA NA NA
## 13 29.0 0.0000 NA NA NA
## 14 31.0 0.0000 NA NA NA
## 15 33.0 0.0000 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
In this case, it is very helpful to plot the effect estimates over time:
ggdid(s7_cs_ag3)
For exposition’s sake, let’s also show the estimate using type = "calendar". For this specific example, I don’t love this aggregation – I think for dynamic effects it makes things a little wonky and doesn’t reveal anything super helpful:
#calendar time specific
#i don't love this one...but CS say they prefer it here
#https://bcallaway11.github.io/did/articles/did-basics.html
s7_cs_ag4 <- aggte(s7_cs, type = "calendar")
summary(s7_cs_ag4)
##
## Call:
## aggte(MP = s7_cs, type = "calendar")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Forthcoming at the Journal of Econometrics <https://arxiv.org/abs/1803.09015>, 2020.
##
##
## Overall ATT:
## ATT Std. Error [95% Conf. Int.]
## 12.375 4.1698 4.2023 20.5477 *
##
##
## Time Effects:
## time ATT Std. Error [95% Simult. Conf. Band]
## 5 3.0 0.0000 NA NA NA
## 6 5.0 0.0000 NA NA NA
## 7 7.0 0.0000 NA NA NA
## 8 9.0 0.0000 NA NA NA
## 9 11.0 0.0000 NA NA NA
## 10 13.0 0.0000 NA NA NA
## 11 15.0 0.0000 NA NA NA
## 12 9.0 11.8608 NA NA NA
## 13 10.5 12.6021 NA NA NA
## 14 12.0 6.6717 NA NA NA
## 15 13.5 14.0847 NA NA NA
## 16 15.0 14.8260 NA NA NA
## 17 16.5 15.5673 NA NA NA
## 18 18.0 8.1543 NA NA NA
## 19 19.5 8.5250 NA NA NA
## 20 21.0 8.8956 NA NA NA
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
ggdid(s7_cs_ag4)
time_since_policy specificationBut what if model TWFE with the time_since_change indicators? Here I run the regression and pull out the effect estimates since the regression output is so large
s7_mod2 <- lm(outcome ~ time_since_policy + state + factor(time),
data = s7)
tidy(s7_mod2) %>% filter(str_detect(term, "time_since")) %>%
mutate(time = as.numeric(gsub("time_since_policy", "", term))) %>%
arrange(time)
## # A tibble: 16 × 6
## term estimate std.error statistic p.value time
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 time_since_policy1 0.621 1.13 0.547 5.87e- 1 1
## 2 time_since_policy2 2.06 1.13 1.81 7.73e- 2 2
## 3 time_since_policy3 4.11 1.16 3.56 9.55e- 4 3
## 4 time_since_policy4 5.64 1.16 4.88 1.63e- 5 4
## 5 time_since_policy5 7.17 1.16 6.21 2.19e- 7 5
## 6 time_since_policy6 8.70 1.16 7.53 2.96e- 9 6
## 7 time_since_policy7 10.2 1.16 8.85 4.59e-11 7
## 8 time_since_policy8 11.7 1.20 9.77 2.89e-12 8
## 9 time_since_policy9 13.3 1.20 11.1 6.07e-14 9
## 10 time_since_policy10 18.7 1.67 11.2 5.00e-14 10
## 11 time_since_policy11 20.8 1.67 12.5 1.48e-15 11
## 12 time_since_policy12 23.0 1.67 13.8 5.41e-17 12
## 13 time_since_policy13 25.2 1.67 15.1 2.41e-18 13
## 14 time_since_policy14 27.4 1.67 16.4 1.29e-19 14
## 15 time_since_policy15 29.5 1.68 17.6 1.04e-20 15
## 16 time_since_policy16 31.7 1.68 18.9 7.43e-22 16
The effect estimates on the policy indicators do not equal an average of the ATTs from the two states for each time period. For this example, this specification gives estimates that are systematically lower than those produced by the Callaway Sant’Anna model.
Takeway: When the treatment effect is staggered, dynamic, and hetergeneous, the TWFE regression specification will produced biased results, as will the specification using a categorical variable for time since treatment. In this case, it is best to use another approach like the Callaway Sant’Anna function.
# s8 <- read_xlsx(here("data", "scenarios.xlsx"), sheet = "scen8",
# col_types = c("text", "text", "text", "numeric"))
#
# s8 %<>% mutate(state_n = as.numeric(as.character(state)),
# policy_n = as.numeric(as.character(policy)))
# s8_bacon <- bacon(formula = outcome ~ policy_n,
# data = s8,
# id_var = "state_n",
# time_var = "time")
```